Stochastic Cubic Regularization for Fast Nonconvex Optimization
Authors
Abstract
This paper proposes a stochastic variant of a classic algorithm: the cubic-regularized Newton method [Nesterov and Polyak, 2006]. The proposed algorithm efficiently escapes saddle points and finds approximate local minima for general smooth, nonconvex functions in only Õ(ε^{-3.5}) stochastic gradient and stochastic Hessian-vector product evaluations. The latter can be computed as efficiently as stochastic gradients. This improves upon the Õ(ε^{-4}) rate of stochastic gradient descent. Our rate matches the best-known result for finding local minima without requiring any delicate acceleration or variance-reduction techniques.
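For readers who want to see the shape of the method, the sketch below shows one stochastic cubic-regularized step in Python/NumPy. It is a minimal sketch, not the authors' exact implementation: it assumes the stochastic Hessian-vector product is formed by a finite difference of stochastic gradients (one way to make it roughly as cheap as a gradient evaluation) and solves the cubic submodel with plain gradient descent; the function names, step sizes, and iteration counts are illustrative choices.

```python
import numpy as np

def stochastic_cubic_step(grad_fn, x, rho, fd_eps=1e-4, inner_steps=50, eta=0.01):
    """One cubic-regularized Newton step (illustrative sketch only).

    grad_fn(x) should return a stochastic gradient (sampling its own minibatch).
    The Hessian-vector product Hv is approximated by the finite difference
    (grad(x + fd_eps*v) - grad(x)) / fd_eps, so each inner iteration costs
    roughly one extra stochastic gradient evaluation.
    """
    g = grad_fn(x)                  # stochastic gradient at the current iterate
    s = np.zeros_like(x)            # candidate displacement

    def hvp(v):
        # Finite-difference stochastic Hessian-vector product.
        return (grad_fn(x + fd_eps * v) - g) / fd_eps

    # Gradient descent on the cubic submodel
    #   m(s) = g.s + 0.5 * s^T H s + (rho / 6) * ||s||^3,
    # whose gradient is g + H s + (rho / 2) * ||s|| * s.
    for _ in range(inner_steps):
        s -= eta * (g + hvp(s) + 0.5 * rho * np.linalg.norm(s) * s)
    return x + s
```

Here rho plays the role of a Hessian Lipschitz constant; in practice it is usually treated as a tuning parameter.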
Similar papers
Sample Complexity of Stochastic Variance-Reduced Cubic Regularization for Nonconvex Optimization
The popular cubic regularization (CR) method converges with first- and second-order optimality guarantees for nonconvex optimization, but suffers from high sample complexity on large-scale problems. Various sub-sampled variants of CR have been proposed to improve the sample complexity. In this paper, we propose a stochastic variance-reduced cubic-regularized (SVRC) Newton's method ...
A SMART Stochastic Algorithm for Nonconvex Optimization with Applications to Robust Machine Learning
In this paper, we show how to transform any optimization problem that arises from fitting a machine learning model into one that (1) detects and removes contaminated data from the training set while (2) simultaneously fitting the trimmed model on the uncontaminated data that remains. To solve the resulting nonconvex optimization problem, we introduce a fast stochastic proximal-gradient algorith...
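As a toy illustration of the trimming idea only (not the paper's SMART stochastic proximal-gradient algorithm), the Python sketch below alternates between keeping the lowest-loss samples and taking a gradient step on that trimmed subset, here for least squares; n_keep, the learning rate, and the squared loss are illustrative assumptions.

```python
import numpy as np

def trimmed_fit(X, y, n_keep, n_iters=200, lr=0.1):
    """Fit a linear model while discarding the highest-loss (suspect) samples.

    Each iteration keeps the n_keep samples with the smallest current loss,
    so contaminated points are excluded while the trimmed model is fit.
    """
    w = np.zeros(X.shape[1])
    for _ in range(n_iters):
        residuals = X @ w - y
        losses = residuals ** 2
        keep = np.argsort(losses)[:n_keep]            # lowest-loss samples
        grad = X[keep].T @ residuals[keep] / n_keep   # gradient on the trimmed set
        w -= lr * grad
    return w
```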
A SMART Stochastic Algorithm for Nonconvex Optimization with Applications to Robust Machine Learning
Machine learning theory typically assumes that training data is unbiased and not adversarially generated. When real training data deviates from these assumptions, trained models make erroneous predictions, sometimes with disastrous effects. Robust losses, such as the Huber norm, were designed to mitigate the effects of such contaminated data, but they are limited to the regression context. In t...
Stochastic Variance-Reduced Cubic Regularized Newton Method
We propose a stochastic variance-reduced cubic regularized Newton method for non-convex optimization. At the core of our algorithm is a novel semi-stochastic gradient along with a semi-stochastic Hessian, which are specifically designed for the cubic regularization method. We show that our algorithm is guaranteed to converge to an (ε, √ε)-approximate local minimum within Õ(n^{4/5}/ε^{3/2}) second-order oracl...
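The term "semi-stochastic gradient" refers to a variance-reduced estimator; the sketch below shows the standard SVRG-style construction in Python as one plausible reading, with the caveat that the paper's actual gradient and Hessian estimators may differ.

```python
import numpy as np

def semi_stochastic_gradient(grad_i, x, x_snap, full_grad_snap, batch):
    """SVRG-style variance-reduced gradient (illustrative sketch).

    grad_i(i, x): gradient of the i-th component function at x.
    x_snap, full_grad_snap: a snapshot point and its exact full gradient.
    The minibatch correction recenters the stochastic gradient around the
    snapshot, which shrinks its variance as x approaches x_snap.
    """
    correction = np.mean([grad_i(i, x) - grad_i(i, x_snap) for i in batch], axis=0)
    return full_grad_snap + correction
```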
Stochastic Particle Gradient Descent for Infinite Ensembles
The superior performance of ensemble methods with infinite models is well known. Most of these methods are based on optimization problems in infinite-dimensional spaces with some regularization; for instance, boosting methods and convex neural networks use L1-regularization with a non-negativity constraint. However, due to the difficulty of handling L1-regularization, these problems require ea...
Journal: CoRR
Volume: abs/1711.02838
Issue: -
Pages: -
Publication year: 2017